Checkpointing with Minimal Recover in Adhocnet based TMR

Author

  • Sarmistha Neogy
Abstract

This paper describes a two-fold approach to using Triple Modular Redundancy (TMR) in a wireless ad hoc network (AdocNet). A distributed checkpointing and recovery protocol is proposed. The protocol eliminates useless checkpoints and selects for recovery only those processes that are dependent within the checkpointing interval concerned. A process starts recovery from its last checkpoint only if it finds that it depends, directly or indirectly, on the faulty process. The recovery protocol also prevents missing and orphan messages. In AdocNet, a set of three mutually connected nodes forms a TMR set, with the nodes designated main, primary and secondary. A node that is main in one set may serve as primary or secondary in another. Computation is not triplicated; instead, the checkpoint taken by main is duplicated in its primary so that primary can continue if main fails. The checkpoint taken by primary is in turn duplicated in secondary, in case primary fails too.
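The selective-recovery rule in the abstract (roll back only if directly or indirectly dependent on the faulty process) can be illustrated with a minimal sketch. It is not the paper's protocol. It assumes, since the abstract does not define dependency, that a process depends on every process it received a message from during the current checkpointing interval. The function name `processes_to_roll_back` and the `received_from` map are hypothetical.

```python
from collections import deque


def processes_to_roll_back(received_from, faulty):
    """Return the faulty process plus every process that depends on it.

    Assumption (not from the paper): `received_from` maps each process
    to the set of processes it received messages from during the
    current checkpointing interval, and receiving a message creates
    a dependency of the receiver on the sender.
    """
    # Invert the map: sender -> processes that received from it.
    dependents = {}
    for receiver, senders in received_from.items():
        for sender in senders:
            dependents.setdefault(sender, set()).add(receiver)

    # Breadth-first search from the faulty process collects direct
    # and indirect dependents.
    rollback = {faulty}
    queue = deque([faulty])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in rollback:
                rollback.add(dependent)
                queue.append(dependent)
    return rollback


# P1 received from P0, P2 received from P1, P3 received nothing.
interval = {"P0": set(), "P1": {"P0"}, "P2": {"P1"}, "P3": set()}
print(sorted(processes_to_roll_back(interval, "P0")))
```

In this example P1 depends on P0 directly and P2 depends on it indirectly, so both restart from their last checkpoint along with P0, while P3 continues without rolling back.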


Related resources

An Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment

Mobile computing systems are made up of different components among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environment. In the scheme suggested nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and as a result the workload of Mobile ...


Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit

Quantum-dot Cellular Automata (QCA) is an emerging and promising technology that provides significant improvements over CMOS. Recently QCA has been advocated as an applicant for implementing reversible circuits. However QCA, like other Nanotechnologies, suffers from a high fault rate. The main purpose of this paper is to develop a fault tolerant model of QCA circuits by redundancy in hardware a...


Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid Computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling is an important issue for large scale computational grid because of its unreliable nature of grid resources. Commonly exploited techniques to realize fault tolerance is periodic Checkpointing that periodically ...


Coherence-based Coordinated Checkpointing for Software Distributed Shared Memory Systems

Fault-tolerant techniques that can cope with system failures in software distributed shared memory (SDSM) are essential for creating productive and highly available parallel computing environments on clusters of workstations. In this paper, we propose a new, efficient coordinated checkpointing technique, called coherence-based coordinated checkpointing (CCC), for SDSM. Our CCC minimizes both th...



Journal title:
  • CoRR

Volume: abs/1511.03202


Publication date: 2015